
    The Replica Consistency Problem in Data Grids

    Fast and reliable data access is a crucial aspect of distributed computing and is often achieved using data replication techniques. In Grid architectures, data are replicated on many nodes of the Grid, and users usually access the "best" replica in terms of availability and network latency. When replicas are modifiable, a change made to one replica breaks its consistency with the other replicas, which become stale at that point. Replica synchronisation protocols exist and are applied in several distributed architectures, for example in distributed databases. Grid middleware solutions provide well-established support for replicating data. Nevertheless, replicas are still considered read-only, and no support is provided to the user for updating a replica while maintaining consistency with the other replicas. In this thesis, carried out in collaboration with the Italian National Institute of Nuclear Physics (INFN) and the European Organisation for Nuclear Research (CERN), we study the replica consistency problem in Grid computing and propose a service, called CONStanza, that is able to synchronise both files and heterogeneous (different vendors) databases in a Grid environment. We analyse and implement a specific use case that arises in high-energy physics, where conditions databases are replicated using databases from different vendors. We provide detailed performance results, and show how CONStanza can be used together with Oracle Streams to provide multitier replication of conditions databases using Oracle and MySQL databases.

    Simulazione di modelli di consistenza per file replicati su sistemi Grid (Simulation of consistency models for replicated files on Grid systems)

    The aim of this thesis is to analyse the consistency problems of replicated files on Grid systems and to design, simulate, and compare some possible solutions. One characteristic of Grid systems is that they allow simple, secure, and coordinated access to an enormous quantity of data (on the order of petabytes) distributed across the nodes of the system, and that they make adequate computing power available to process it. To improve data access, replication techniques are used which, while bringing large benefits, also increase the volume of data to manage and introduce new management problems, such as consistency. Indeed, if we allow a user to modify a replica of a data item, we also need mechanisms to synchronise the other replicas. After an in-depth study of the problem, we look at some possible solutions and run simulations to evaluate their impact on the system.

    A performance study on the synchronisation of heterogeneous Grid databases using CONStanza

    In Grid environments, several heterogeneous database management systems are used in various administrative domains. However, data exchange and synchronisation need to be available across different sites and different database systems. In this article we present our data consistency service CONStanza and give details on how we achieve relaxed update synchronisation between different database implementations. Integration into existing Grid environments is one of the major goals of the system. Performance tests have been executed following a factorial approach. Detailed experimental results and a statistical analysis are presented to evaluate the system components and drive future developments. © 2010 Elsevier B.V. All rights reserved.

    Relaxed Data Consistency with CONStanza

    Data replication is an important aspect of a Data Grid for increasing fault tolerance and availability. Many Grid replication tools or middleware systems deal with read-only files, which implies that replicated data items are always consistent. However, there are several applications that do require updates to existing data and the respective replicas. In this article we present a replica consistency service that allows for replica updates in a single-master scenario with lazy update synchronisation. The system allows for updates of (heterogeneous) relational databases, and it is designed to support flat files as well. It keeps remote replicas synchronised and partially (“lazily”) consistent. We report on the design and implementation of a novel “relaxed” replica consistency service and show its usefulness in a typical application use case.
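
    The single-master, lazy update scheme mentioned in this abstract can be illustrated with a minimal sketch. It is not CONStanza's implementation or API: the class names `Master` and `Replica`, the in-memory update log, and the key `run_42/calibration` are all assumptions made purely for illustration. Writes go only to the master, which records them in a log, and replicas stay stale until a separate propagation step ships them the entries they have not yet applied.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical read-only secondary copy; it changes only via propagation."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of master log entries applied so far

    def apply(self, entries):
        # Replay the missing log entries in order.
        for key, value in entries:
            self.data[key] = value
            self.applied += 1


@dataclass
class Master:
    """Hypothetical single writable copy; every update goes through it."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    replicas: list = field(default_factory=list)

    def update(self, key, value):
        # Commit locally and record the change; replicas are not contacted here.
        self.data[key] = value
        self.log.append((key, value))

    def propagate(self):
        # Lazy step: send each replica only the log entries it has not seen.
        for replica in self.replicas:
            replica.apply(self.log[replica.applied:])


replica = Replica()
master = Master(replicas=[replica])
master.update("run_42/calibration", 1.07)
stale = replica.data.get("run_42/calibration")  # replica still lags behind
master.propagate()
fresh = replica.data.get("run_42/calibration")  # replica has caught up
```

    Between `update` and `propagate` the replica serves stale data, which is the relaxed consistency the abstract refers to; how often propagation runs is a design choice this sketch leaves open.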

    Replica consistency in a Data Grid

    A Data Grid is a wide area computing infrastructure that employs Grid technologies to provide storage capacity and processing power to applications that handle very large quantities of data. Data Grids rely on data replication to achieve better performance and reliability by storing copies of data sets on different Grid nodes. When a data set can be modified by applications, the problem of maintaining consistency among existing copies arises. The consistency problem also concerns metadata, i.e., additional information about application data sets such as indices, directories, or catalogues. This kind of metadata is used both by the applications and by the Grid middleware to manage the data. For instance, the Replica Management Service (the Grid middleware component that controls data replication) uses catalogues to find the replicas of each data set. Such catalogues can also be replicated, and their consistency is crucial to the correct operation of the Grid. Therefore, metadata consistency generally poses stricter requirements than data consistency. In this paper we report on the development of a Replica Consistency Service based on the middleware mainly developed by the European Data Grid Project. The paper summarises the main issues in the replica consistency problem and lays out a high-level architectural design for a Replica Consistency Service. Finally, results from simulations of different consistency models are presented.